Slurm Scheduler and Resource Manager
Aims
Actions
squeue, sinfo, and scontrol. 🤓 Google/Bing will help you find more
👍 User Forum at https://hpc-talk.cubi.bihealth.org!
Resource Manager
Job Scheduler
holtgrem_c@hpc-login-1$ srun --pty --time=2:00:00 --partition=training \
--mem=10G --cpus-per-task=1 bash -i
srun: job 14629328 queued and waiting for resources
srun: job 14629328 has been allocated resources
holtgrem_c@hpc-cpu-141$

srun
squeue -u $USER
scontrol show job 14629328

squeue
🤸 What is the output of squeue?
holtgrem_c@hpc-cpu-141$ squeue -u $USER
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
14629328 training bash holtgrem R 1:40 1 hpc-cpu-141

More info with --long:
scontrol show job
🤸 Let us look at scontrol show job 14629328:
holtgrem_c@hpc-cpu-141$ scontrol show job 14629328
JobId=14629328 JobName=bash
UserId=holtgrem_c(100131) GroupId=hpc-ag-cubi(1005272) MCS_label=N/A
Priority=661 Nice=0 Account=hpc-ag-cubi QOS=normal
JobState=RUNNING Reason=None Dependency=(null)
Requeue=1 Restarts=0 BatchFlag=0 Reboot=0 ExitCode=0:0
RunTime=00:06:37 TimeLimit=02:00:00 TimeMin=N/A
SubmitTime=2023-07-11T15:17:37 EligibleTime=2023-07-11T15:17:37
AccrueTime=2023-07-11T15:17:37
StartTime=2023-07-11T15:17:53 EndTime=2023-07-11T17:17:53 Deadline=N/A
SuspendTime=None SecsPreSuspend=0 LastSchedEval=2023-07-11T15:17:53 Scheduler=Backfill
Partition=training AllocNode:Sid=hpc-login-1:3631083
ReqNodeList=(null) ExcNodeList=(null)
NodeList=hpc-cpu-141
BatchHost=hpc-cpu-141
NumNodes=1 NumCPUs=1 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
TRES=cpu=1,mem=10G,node=1,billing=1
Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
MinCPUsNode=1 MinMemoryNode=10G MinTmpDiskNode=0
Features=(null) DelayBoot=00:00:00
OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
Command=bash
WorkDir=/data/cephfs-1/home/users/holtgrem_c
Power=

The job in PENDING state:
holtgrem_c@hpc-cpu-141$ scontrol show job 14629381
JobId=14629381 JobName=bash
UserId=holtgrem_c(100131) GroupId=hpc-ag-cubi(1005272) MCS_label=N/A
Priority=661 Nice=0 Account=hpc-ag-cubi QOS=normal
JobState=PENDING Reason=Priority Dependency=(null)
Requeue=1 Restarts=0 BatchFlag=0 Reboot=0 ExitCode=0:0
RunTime=00:00:00 TimeLimit=02:00:00 TimeMin=N/A
SubmitTime=2023-07-11T15:26:33 EligibleTime=2023-07-11T15:26:33
AccrueTime=Unknown
StartTime=Unknown EndTime=Unknown Deadline=N/A
SuspendTime=None SecsPreSuspend=0 LastSchedEval=2023-07-11T15:26:33 Scheduler=Main
Partition=short AllocNode:Sid=hpc-login-1:3644832
ReqNodeList=(null) ExcNodeList=(null)
NodeList=
NumNodes=1 NumCPUs=1 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
TRES=cpu=1,mem=10G,node=1,billing=1
Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
MinCPUsNode=1 MinMemoryNode=10G MinTmpDiskNode=0
Features=(null) DelayBoot=00:00:00
OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
Command=bash
WorkDir=/data/cephfs-1/home/users/holtgrem_c
Power=

The job while running on a node:
holtgrem_c@hpc-cpu-141$ scontrol show job 14629381
JobId=14629381 JobName=bash
UserId=holtgrem_c(100131) GroupId=hpc-ag-cubi(1005272) MCS_label=N/A
Priority=661 Nice=0 Account=hpc-ag-cubi QOS=normal
JobState=RUNNING Reason=None Dependency=(null)
Requeue=1 Restarts=0 BatchFlag=0 Reboot=0 ExitCode=0:0
RunTime=00:05:04 TimeLimit=02:00:00 TimeMin=N/A
SubmitTime=2023-07-11T15:26:33 EligibleTime=2023-07-11T15:26:33
AccrueTime=2023-07-11T15:26:33
StartTime=2023-07-11T15:26:53 EndTime=2023-07-11T17:26:53 Deadline=N/A
SuspendTime=None SecsPreSuspend=0 LastSchedEval=2023-07-11T15:26:53 Scheduler=Backfill
Partition=short AllocNode:Sid=hpc-login-1:3644832
ReqNodeList=(null) ExcNodeList=(null)
NodeList=hpc-cpu-144
BatchHost=hpc-cpu-144
NumNodes=1 NumCPUs=1 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
TRES=cpu=1,mem=10G,node=1,billing=1
Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
MinCPUsNode=1 MinMemoryNode=10G MinTmpDiskNode=0
Features=(null) DelayBoot=00:00:00
OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
Command=bash
WorkDir=/data/cephfs-1/home/users/holtgrem_c
Power=

The job “just” after being terminated:
holtgrem_c@hpc-cpu-141$ scontrol show job 14629381
JobId=14629381 JobName=bash
UserId=holtgrem_c(100131) GroupId=hpc-ag-cubi(1005272) MCS_label=N/A
Priority=661 Nice=0 Account=hpc-ag-cubi QOS=normal
JobState=COMPLETED Reason=None Dependency=(null)
Requeue=1 Restarts=0 BatchFlag=0 Reboot=0 ExitCode=0:0
RunTime=00:07:52 TimeLimit=02:00:00 TimeMin=N/A
SubmitTime=2023-07-11T15:26:33 EligibleTime=2023-07-11T15:26:33
AccrueTime=2023-07-11T15:26:33
StartTime=2023-07-11T15:26:53 EndTime=2023-07-11T15:34:45 Deadline=N/A
SuspendTime=None SecsPreSuspend=0 LastSchedEval=2023-07-11T15:26:53 Scheduler=Backfill
Partition=short AllocNode:Sid=hpc-login-1:3644832
ReqNodeList=(null) ExcNodeList=(null)
NodeList=hpc-cpu-144
BatchHost=hpc-cpu-144
NumNodes=1 NumCPUs=1 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
TRES=cpu=1,mem=10G,node=1,billing=1
Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
MinCPUsNode=1 MinMemoryNode=10G MinTmpDiskNode=0
Features=(null) DelayBoot=00:00:00
OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
Command=bash
WorkDir=/data/cephfs-1/home/users/holtgrem_c
Power=

After some time, the job is not known to the controller any more…
… but we can still get some information from the accounting (for 4 weeks) …
holtgrem_c@hpc-cpu-141$ sacct -j 14629381
JobID JobName Partition Account AllocCPUS State ExitCode
------------ ---------- ---------- ---------- ---------- ---------- --------
14629381 bash short hpc-ag-cu+ 1 COMPLETED 0:0
14629381.ex+ extern hpc-ag-cu+ 1 COMPLETED 0:0
14629381.0 bash hpc-ag-cu+ 1 COMPLETED 0:0

You can use sacct -j JOBID --long | less -SR to see all available accounting information.
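Such accounting queries are easy to script. A minimal guarded sketch using the job ID from the session above; the --format fields are standard sacct columns, and the guard just prints a message when run off-cluster:

```shell
# Query accounting for a finished job; degrades gracefully off-cluster.
jobid=14629381    # the job ID from the session above
if command -v sacct >/dev/null 2>&1; then
    # Key fields: state, exit code, wall time, and peak memory (MaxRSS).
    sacct -j "$jobid" --format=JobID,JobName,State,ExitCode,Elapsed,MaxRSS
else
    echo "sacct not available (not on a Slurm cluster)"
fi
```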
Sadly, it failed:
holtgrem_c@hpc-login-1$ scontrol show job 14629473
JobId=14629473 JobName=first-job.sh
UserId=holtgrem_c(100131) GroupId=hpc-ag-cubi(1005272) MCS_label=N/A
Priority=761 Nice=0 Account=hpc-ag-cubi QOS=normal
JobState=FAILED Reason=NonZeroExitCode Dependency=(null)
Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=1:0
RunTime=00:00:00 TimeLimit=2-00:00:00 TimeMin=N/A
SubmitTime=2023-07-11T15:44:31 EligibleTime=2023-07-11T15:44:31
AccrueTime=2023-07-11T15:44:31
StartTime=2023-07-11T15:44:54 EndTime=2023-07-11T15:44:54 Deadline=N/A
SuspendTime=None SecsPreSuspend=0 LastSchedEval=2023-07-11T15:44:54 Scheduler=Backfill
Partition=medium AllocNode:Sid=hpc-login-1:3644832
ReqNodeList=(null) ExcNodeList=(null)
NodeList=hpc-cpu-219
BatchHost=hpc-cpu-219
NumNodes=1 NumCPUs=1 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
TRES=cpu=1,mem=1G,node=1,billing=1
Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
MinCPUsNode=1 MinMemoryCPU=1G MinTmpDiskNode=0
Features=(null) DelayBoot=00:00:00
OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
Command=/data/cephfs-1/home/users/holtgrem_c/first-job.sh
WorkDir=/data/cephfs-1/home/users/holtgrem_c
StdErr=/data/cephfs-1/home/users/holtgrem_c/slurm-14629473.out
StdIn=/dev/null
StdOut=/data/cephfs-1/home/users/holtgrem_c/slurm-14629473.out
Power=

Troubleshooting our job failure:
More troubleshooting hints:
scontrol show job JOBID | grep Reason
Does the WorkDir exist, and do you have access to it?
cd $WorkDir
Check the StdOut/StdErr log files, if any.
sacct -j 14629473 --format=JobID,State,ExitCode,Elapsed,MaxVMSize to look for hints regarding running time/memory (VM) sizes

srun/sbatch

We can explicitly allocate resources with the srun and sbatch command lines:
--job-name=MY-JOB-NAME: explicit naming
--time=D-HH:MM:SS: max running time
--partition=PARTITION: partition
--mem=MEMORY: allocate memory, use <num>G or <num>M
--cpus-per-task=CORES: number of cores to allocate

Write a job script that …
sleep 1m (hint: how can you figure out the maximal memory used?)
job-1.sh that triggers job-2.sh on completion (is this useful? dangerous?)
Use online resources to figure out the right command line parameters.
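One possible solution for the sleep exercise; a minimal sketch assuming the training partition from earlier, with a hypothetical file name sleep-job.sh. The maximal memory used can later be found in sacct's MaxRSS column:

```shell
# Write a minimal batch script; the #SBATCH lines mirror the srun flags above.
cat > sleep-job.sh <<'EOF'
#!/usr/bin/env bash
#SBATCH --job-name=sleep-job
#SBATCH --time=0-00:05:00
#SBATCH --partition=training
#SBATCH --mem=100M
#SBATCH --cpus-per-task=1
sleep 1m
EOF

# Submit it (only works on the cluster):
# sbatch sleep-job.sh
# Afterwards, look up the peak memory:
# sacct -j JOBID --format=JobID,State,Elapsed,MaxRSS
```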
Use the following commands, consulting the online help and man $command to figure out the output:
sdiag
squeue
sinfo
scontrol show node NODE
sprio -l -S -y
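To study the outputs side by side, you can capture each one into a file; a small sketch over the node-independent commands (cmd-<name>.txt is our naming choice), which skips commands that are not on the PATH, e.g. when trying this off-cluster:

```shell
# Capture each command's output into cmd-<name>.txt for later study.
for cmd in "sdiag" "squeue" "sinfo" "sprio -l"; do
    name=${cmd%% *}                       # first word, e.g. "sprio"
    if command -v "$name" >/dev/null 2>&1; then
        $cmd > "cmd-$name.txt" 2>&1
    else
        echo "skipping $name (not on a Slurm cluster)"
    fi
done
```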
Provoke the following situations:
In each case, look at scontrol/sacct output and look at log files.
squeue Output

TODO
e.g.,
squeue -o "%.10i %9P %60j %10u %.2t %.10M %.6D %.4C %20R %b" "$@"
use -u $USER
use | less -S
Your turn 🤸: look at man squeue and find your “top 3 most useful” values.
.bashrc Aliases

alias sbi='srun --pty --time 7-00 --mem=5G --cpus-per-task 2 bash -i'
alias slurm-loginx='srun --pty --time 7-00 --partition long --x11 bash -i'
alias sq='squeue -o "%.10i %9P %60j %10u %.2t %.10M %.6D %.4C %20R %b" "$@"'
alias sql='sq "$@" | less -S'
alias sqme='sq -u holtgrem_c "$@"'
alias sqmel='sqme "$@" | less -S'

set -x and set -v
scancel
squeue
Your turn 🤸: submit a job, cancel it, look at scontrol and sacct output.
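set -x and set -v are the standard bash tools for debugging job scripts: -v echoes script lines as they are read, -x traces commands (prefixed with "+") as they run. A small runnable illustration:

```shell
# set -x prints each command to stderr, prefixed with "+", before running it.
demo() {
    set -x
    echo "computing"
    result=$((6 * 7))
    set +x
}
trace=$(demo 2>&1 >/dev/null)   # capture only the stderr trace
echo "$trace"
# The trace contains a line like: + echo computing
```

In a job script, putting set -x near the top makes the slurm-JOBID.out log show exactly which command ran last before a failure.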
sacctmgr

TODO: explain
holtgrem_c@hpc-login-1$ sacctmgr show qos -p | cut -d '|' -f 1,19,20 | column -s '|' -t
Name MaxWall MaxTRESPU
normal cpu=512,mem=3.50T
debug 01:00:00 cpu=1000,mem=7000G
medium 7-00:00:00 cpu=512,mem=3.50T
critical cpu=12000,mem=84000G
long 14-00:00:00 cpu=64,mem=448G
highmem
gpu-interactive 01:00:00
short 04:00:00 cpu=2000,mem=14000G
gpu 7-00:00:00
staging 14-00:00:00 cpu=4000,mem=28000G
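The cut/column pipeline used above is worth remembering on its own: cut selects pipe-delimited fields and column -t aligns them into a table. A runnable demonstration on canned, abbreviated sacctmgr-style output (real sacctmgr -p output has many more fields, hence -f 1,19,20 above):

```shell
# Simulate abbreviated `sacctmgr show qos -p` output: pipe-delimited lines.
qos_sample() {
    printf '%s\n' \
        'Name|MaxWall|MaxTRESPU' \
        'medium|7-00:00:00|cpu=512,mem=3.50T' \
        'short|04:00:00|cpu=2000,mem=14000G'
}
# Select columns 1-3 and align them into a readable table.
qos_sample | cut -d '|' -f 1,2,3 | column -s '|' -t
```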
sbatch

Your turn 🤸: write two jobs with -d afterok:JOBID
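A sketch for the dependency exercise; the file names job-1.sh and job-2.sh are hypothetical. sbatch --parsable prints just the job ID, which makes chaining easy:

```shell
# Create two trivial batch scripts for the dependency exercise.
for i in 1 2; do
    cat > "job-$i.sh" <<EOF
#!/usr/bin/env bash
#SBATCH --job-name=job-$i
#SBATCH --time=0-00:05:00
echo "this is job $i"
EOF
done

# On the cluster: submit job 1, then make job 2 wait for its success.
# jid=$(sbatch --parsable job-1.sh)
# sbatch -d "afterok:$jid" job-2.sh
```

With afterok, job 2 never starts if job 1 fails; it stays pending with Reason=DependencyNeverSatisfied until cancelled.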
Your turn 🤸: start xterm if you have a local X11 server.
holtgrem_c@hpc-login-1$ scontrol show reservation
ReservationName=svc-bih-cubi-demux_c_6 StartTime=2022-08-26T12:54:31 EndTime=2023-08-26T12:54:31 Duration=365-00:00:00
Nodes=hpc-cpu-207 NodeCnt=1 CoreCnt=16 Features=(null) PartitionName=(null) Flags=SPEC_NODES
TRES=cpu=32
Users=svc-bih-cubi-demux_c Groups=(null) Accounts=(null) Licenses=(null) State=ACTIVE BurstBuffer=(null) Watts=n/a
MaxStartDelay=(null)
A challenge? Some options:
make install
sudo apt install
root on the HPC

Conda is an open-source, cross-platform, language-agnostic package manager and environment management system.
– Wikipedia
Conda allows you to:
bioconda channel

Plus, it integrates well with Snakemake (more about that later)
Use the following steps for installation:
# on login node
srun --partition=training --mem=5G --pty bash -i
# on a compute node
wget -O /tmp/Miniforge3-Linux-x86_64.sh \
https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh
mkdir -p $HOME/work/miniconda3
ln -sr $HOME/work/miniconda3 $HOME/miniconda3
bash /tmp/Miniforge3-Linux-x86_64.sh -s -b -p $HOME/work/miniconda3

Configure:
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge
conda config --set channel_priority strict
cat ~/.condarc

Now you can activate it with source ~/miniconda3/bin/activate.
Creating an environment:
mamba create --yes --name read-mapping bwa samtools
conda activate read-mapping
## or: source ~/miniconda3/bin/activate read-mapping

Showing what is installed:
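To check what actually ended up in the environment, a guarded sketch (the read-mapping name comes from the example above); it degrades to a message where conda is not installed:

```shell
# Inspect the conda installation and the new environment, if present.
show_conda() {
    if command -v conda >/dev/null 2>&1; then
        conda env list       # all environments and their locations
        # Packages inside the environment (message if it does not exist yet).
        conda list --name read-mapping 2>/dev/null \
            || echo "(no read-mapping environment yet)"
    else
        echo "conda not on PATH (activate it first, see above)"
    fi
}
show_conda
```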
.sif files with Apptainer
.sif files from Docker containers
.sif files from scratch

Apptainer (fka Singularity) is a container system for HPC.
What are containers?
flowchart TD
A[OS User Land] --> B[OS Kernel]
C[Apptainer Layer] --> B
D[Your app] --> A
E[Your Container] --> C
➡️ Reproducible, transferable application installations
.sif Image Files
.sif Files
.sif Images from Scratch

🫵 Where can you apply what you have learned in your PhD project?
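The typical Apptainer workflow, pulling a Docker image into a .sif file and running a command inside it, can be sketched as follows; my-ubuntu.sif and ubuntu:22.04 are example names, and the commands only do real work where apptainer is installed:

```shell
# Build and use a .sif image from a Docker image, if apptainer is available.
demo_apptainer() {
    if command -v apptainer >/dev/null 2>&1; then
        # Pull a Docker image into a local .sif file (output name chosen by us).
        apptainer pull my-ubuntu.sif docker://ubuntu:22.04
        # Run a command inside the container.
        apptainer exec my-ubuntu.sif cat /etc/os-release
    else
        echo "apptainer not available here (run this on a compute node)"
    fi
}
demo_apptainer
```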
… but that's all for this session
Recap